(57) Abstract

[...] corresponds to a video frame of the visual information. The digital data stream is generated with an encoder. The encoder generates tag
data that indicates locations of the video frame data within the digital data stream. The digital data stream is stored at a location from which
the digital data stream is delivered to a client. Tag data is stored at a location from which the tag data may be used to provide the client
non-sequential access to the digital data stream. A selected set of video frames within the digital data stream is selected based on the tag data
in response to a request for non-sequential access by the client. A second digital data stream that includes the video frame data that
corresponds to each video frame of the selected set of video frames is constructed and transmitted to the client.

METHOD AND APPARATUS FOR CONCURRENTLY ENCODING AND TAGGING DIGITAL
VIDEO DATA

RELATED APPLICATIONS

The present application is related to: U.S. Patent Application No. 08/956,263,
entitled "METHOD AND APPARATUS FOR NON-SEQUENTIAL ACCESS TO AN
IN-PROGRESS VIDEO FEED" filed by Daniel Weaver, Mark A. Porter and David J.
Pawson, on 22 October 1997, (attorney docket no. 3018-128) the contents of which are
incorporated herein by reference; and

U.S. Patent Application No. 08/956,262, entitled "METHOD AND APPARATUS
FOR IMPLEMENTING SEAMLESS PLAYBACK OF CONTINUOUS MEDIA FEEDS"
filed by Daniel Weaver and David J. Pawson on 22 October 1997, (attorney docket
no. 3018-101) the contents of which are incorporated herein by reference.

FIELD OF THE INVENTION 

The present invention relates to a method and apparatus for processing audio-visual 
information, and more specifically, to a method and apparatus for providing non-sequential 
access to audio-visual information represented in a live content stream. 

BACKGROUND OF THE INVENTION 

In recent years, the media industry has expanded its horizons beyond traditional
analog technologies. Audio, photographs, and even feature films are now being recorded 
or converted into digital formats. To encourage compatibility between products, standard 
formats have been developed in many of the media categories. 

As would be expected, the viewers of digital video desire the same functionality 
from the providers of digital video as they now enjoy while watching analog video tapes on 
video cassette recorders. For example, viewers want to be able to make the video jump 
ahead, jump back, fast forward, fast rewind, slow forward, slow rewind and freeze frame.

Various approaches have been developed to provide non-sequential playback of
digital video data. With respect to digital video data, non-sequential playback refers to any
playback operation that does not play all of the encoded frames in the exact
sequence in which they were encoded. For example, jump ahead and fast forward
operations are non-sequential in that some frames are skipped. Rewind operations at any
speed are non-sequential in that during a rewind operation, frames are not played in the
sequence in which they are encoded.

One approach to providing non-sequential playback of digital video data, referred to 
herein as the tag-based approach, is described in U.S. Patent No. 5,659,539, entitled 
"Method and Apparatus for Frame Accurate Access of Digital Audio-visual Information" 
issued to Porter et al. on August 19, 1997, the contents of which are incorporated herein by
this reference. According to the tag-based approach, a stored digital video file is parsed to
generate "tag information" about individual frames within the file. 

Specifically, the tag file contains information about the state of one or more state 
machines that are used to decode the digital representation. The state information varies
depending on the specific technique used to encode the audio-visual work. For MPEG-2
files, for example, the tag file includes information about the state of the program
elementary stream state machine, the video state machine, and the transport layer state 
machine. 

During the performance of the audio-visual work, data from the digital 
representation is sent from a video pump to a decoder. The information in the tag file is
used to perform seek, fast forward, fast rewind, slow forward and slow rewind operations
during the performance of the audio-visual work. Seek operations are performed by
causing the video pump to stop transmitting data from the current position in the digital
representation, and to start transmitting data from a new position in the digital
representation. The information in the tag file is inspected to determine the new position
from which to start transmitting data. To ensure that the data stream transmitted by the 
video pump maintains compliance with the applicable video format, prefix data that 
includes appropriate header information is transmitted by the video pump prior to 
transmitting data from the new position.

Fast forward, fast rewind, slow forward and slow rewind operations are performed
by selecting video frames based on the information contained in the tag file and the desired
presentation rate, and generating a data stream containing data that represents the selected
video frames. The selection process takes into account a variety of factors, including the
data transfer rate of the channel on which the data is to be sent, the frame type of the
frames, a minimum padding rate, and the possibility of a buffer overflow on the decoder.
Prefix and suffix data are inserted into the transmitted data stream before and after the data
for each frame in order to maintain compliance with the data stream format expected by the
decoder. 
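
As a rough illustration of the seek operation described above, the following Python sketch looks up a new position in a list of tag entries and places prefix data ahead of the data sent from that position. The `Tag` fields, the function names, and the rule of choosing the last frame at or before the requested time are assumptions made for illustration only; they are not taken from the referenced patent.

```python
import bisect
from dataclasses import dataclass


@dataclass
class Tag:
    time_ms: int    # display time of the frame, relative to the start of the work
    start_pos: int  # byte position of the frame data in the stored file


def find_seek_tag(tags, target_ms):
    """Return the tag of the last frame displayed at or before target_ms.

    `tags` must be sorted by time_ms. If target_ms precedes every frame,
    the first tag is returned.
    """
    times = [t.time_ms for t in tags]
    index = bisect.bisect_right(times, target_ms) - 1
    return tags[max(index, 0)]


def seek_stream(content, tags, target_ms, prefix):
    """Build the bytes sent after a seek: prefix data, then content from the new position."""
    tag = find_seek_tag(tags, target_ms)
    return prefix + content[tag.start_pos:]
```

In this sketch the prefix bytes stand in for the header information that keeps the transmitted stream compliant with the video format; how that prefix is built depends on the format and is not modeled here.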

The tag-based approach works well when there is enough time between the creation 
of the original digital video stream and the viewing of the digital video stream to allow the
original digital video stream to be parsed to generate tag information. However, when the
digital video stream is being viewed as it is being generated, parsing the digital video
stream becomes impractical. The amount of computational power required to parse the
digital video stream as it arrives would be prohibitively expensive. On the other hand, it is
not considered acceptable to increase the latency between the occurrence of many types of
video feeds (e.g. sporting events) and the time at which such feeds are available for 
audience viewing. 

When a video stream is made available for viewing before generation of the stream 
has been completed, the video stream is said to be a "live feed". At a professional level,
non-linear digital editors can be used to rapidly review footage of a live feed for a single
user. However, these systems are not intended for and cannot be easily adapted to serve
many users. For example, if a hundred users were watching the same live feed but wanted
to rewind, pause, and fast forward the feed at different times, each would require a separate
non-linear digital editor.

Another problem associated with providing non-linear access to live digital video 
streams is that users may attempt to fast forward into portions of the video stream that do 
not yet exist. For example, a viewer may attempt to fast forward a live feed to see the final
score of a game which, in reality, has not yet ended. It is desirable to provide techniques for
handling these types of situations in a way that ensures that the decoder will not freeze nor
the video stream become corrupted.

Based on the foregoing, it is clearly desirable to provide a method and apparatus for 
sequentially displaying non-sequential frames of a live digital video. It is further desirable 
to provide such non-sequential access to live digital video in a way that does not require
each viewer to operate prohibitively expensive hardware. It is also desirable to provide
safeguards against attempts to access portions of a live digital video stream that do not yet 
exist. 

SUMMARY OF THE INVENTION 
A method and system for providing non-sequential access to visual information that
is being digitally encoded in a digital data stream is provided. The digital data stream
includes a sequence of video frame data. Each video frame data in the sequence of video
frame data corresponds to a video frame of the visual information.

The digital data stream is generated with an encoder. The encoder generates tag
data that indicates locations of the video frame data within the digital data stream. The
digital data stream is stored at a location from which the digital data stream is delivered to
a client. Tag data is stored at a location from which the tag data may be used to provide the
client non-sequential access to the digital data stream.

A selected set of video frames within the digital data stream is selected based on the
tag data in response to a request for non-sequential access by the client. A second digital
data stream that includes the video frame data that corresponds to each video frame of the
selected set of video frames is constructed and transmitted to the client.

According to another aspect of the invention, the encoder includes a real-time 
CODEC that generates digital information in response to the visual information and a 
multiplexer coupled to the real-time CODEC. The multiplexer arranges the digital
information generated by the real-time CODEC according to a digital video format. The
multiplexer generates the tag data to indicate how the multiplexer arranged the digital
information.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example, and not by way of 
limitation, in the figures of the accompanying drawings and in which like reference 
numerals refer to similar elements and in which: 

Figure 1 is a block diagram that illustrates a video delivery system according to an
embodiment of the invention; 

Figure 2A is a block diagram that illustrates the format of an MPEG file; 

Figure 2B is a block diagram of an exemplary tag file according to an embodiment 
of the invention; 

Figure 2C is a block diagram illustrating the tag information generated for each 
frame in an MPEG-1 file according to an embodiment of the invention; 

Figure 3A is a block diagram illustrating a storage system that uses RAID error
correction techniques according to an embodiment of the invention; 

Figure 3B is a block diagram illustrating a storage system that combines RAID 
error correction and disk striping according to an embodiment of the invention; 

Figure 4 is a block diagram illustrating a series of content files used to store the 
content of a continuous feed according to an embodiment of the invention; and 

Figure 5 is a block diagram illustrating the migration of tag information from an old 
tag file to a new tag file in response to the expiration of tag data within the old tag file. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT 

A method and apparatus for providing non-sequential access to a live digital video 
stream is described. In the following description, for the purposes of explanation, 
numerous specific details are set forth in order to provide a thorough understanding of the
present invention. It will be apparent, however, to one skilled in the art that the present
invention may be practiced without these specific details. In other instances, well-known 
structures and devices are shown in block diagram form in order to avoid unnecessarily 
obscuring the present invention. 

FUNCTIONAL OVERVIEW
According to one aspect of the invention, the difficulty associated with applying the
tag-based approach to live digital video feeds is addressed by eliminating the need to parse
an incoming digital video stream in real time. Instead of generating tag data by parsing the
digital video stream, the unit responsible for encoding the live feed retains information
about how the data was encoded and transmits that information to the video server along
with the encoded data. The tag information arrives at the video server along with the
corresponding content, so the content itself does not have to be parsed.

According to another aspect of the invention, the video server is configured to 
ensure that the client cannot seek or scan past the end of the received content. Due to the
fact that there will be some amount of skew between the arrival time of the content and the
corresponding tags, the server is configured to make sure that tags are not used
prematurely, i.e. such that they would cause the server to go past the end of the available
content.

EXEMPLARY SYSTEM 
Figure 1 is a block diagram illustrating an exemplary audio-visual information
delivery system 100 for delivering and providing non-sequential access to live digital video
feeds. Audio-visual information delivery system 100 generally includes an encoder 101, a
video server 106, a Media Data Store (MDS) 112, a database 116, a stream server 118, a
video pump 120, and a client 122.

THE ENCODER 

Encoder 101 receives audio-visual input and generates a digital stream of data that
encodes the audio-visual input according to a particular format. Numerous video encoding
formats have been developed and are well known in the industry. For example, the MPEG
formats are described in detail in the following international standards: ISO/IEC 13818-1,
2, 3 (MPEG-2) and ISO/IEC 11172-1, 2, 3 (MPEG-1). Documents that describe these
standards (hereafter referred to as the "MPEG specifications") are available from ISO/IEC
Copyright Office, Case Postale 56, CH 1211, Geneve 20, Switzerland. While specific
formats may be referenced herein for the purposes of explanation, the present invention is
not restricted to any particular digital stream format.

Encoder 101 includes a Coder/Decoder (CODEC) 102 and a multiplexer (MUX)
104. CODEC 102 converts visual or audio-visual information from an input source to
compressed digital data. CODEC 102 may be, for example, a fractal compressor or an
MPEG compressor. For the purposes of illustration, it shall be assumed that the video
source being captured by CODEC 102 is a live source and, consequently, CODEC 102 is
encoding video at 1X relative to real time. However, the video source may alternatively be
a stored video source which CODEC 102 encodes at any rate relative to real time.

MUX 104 multiplexes the compressed audio and visual information generated by
CODEC 102 to generate a compressed video stream. In the compressed video stream, the
data representing video frames and audio are merged and formatted according to the
particular digital format supported by encoder 101. The specific operations performed
during the merging process will vary based on the type of encoding employed. For
example, the merging process may involve determining the order and placement of
portions of digitized audio and video in the stream and inserting metadata at various points
within the stream. The metadata may take the form, for example, of header information
that identifies the starting point and content of "packets" within the stream. The stream of
compressed audio-visual information constructed by MUX 104 is transmitted from the
encoder 101 to the video server 106 over a communication channel 128.

CONTROL INFORMATION 
According to one aspect of the invention, the encoder 101 sends control information
to the video server 106 over a communication channel 130 in parallel with the video
stream. The control information sent over channel 130 includes specific information about
how the encoder 101 constructed the video stream. This control information includes tag
data that will be used by the stream server 118 to provide non-sequential access to the
video stream. Specifically, the control information may include information about the
type, length, and boundaries of the various frames encoded in the video stream as well as
header information that specifies the compression ratio, the bit rate, and other types of
information the video server 106 requires to determine how to process the video stream.

Significantly, the generation of the control information involves minimal additional
computational power because MUX 104 generates most of the information already during
the construction of the content stream. Specifically, MUX 104 arranges and encapsulates
the digital video and audio data from CODEC 102. Since MUX 104 is packaging the
content, MUX 104 knows the contents of and boundaries between the packages.
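
The point that tag data falls out of the packaging step almost for free can be sketched in Python as follows. This is a deliberately simplified model, not the MUX of the described embodiment: it appends whole frames to a byte stream without any packetization, and the `FrameTag` and `TaggingMux` names and fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FrameTag:
    frame_type: str  # e.g. "I", "P" or "B"
    start: int       # byte offset of the first byte of the frame in the stream
    length: int      # number of bytes of frame data


class TaggingMux:
    """Appends frames to a content stream and records a tag for each frame."""

    def __init__(self):
        self.stream = bytearray()  # content that would go out on channel 128
        self.tags = []             # control data that would go out on channel 130

    def add_frame(self, frame_type, payload):
        # The mux already knows where it is writing, so the tag costs
        # only a record of the current offset and the payload length.
        tag = FrameTag(frame_type, len(self.stream), len(payload))
        self.stream += payload
        self.tags.append(tag)
        return tag
```

Because each tag is written at the moment the frame is placed, nothing downstream has to parse the stream to rediscover the frame boundaries.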

COMMUNICATION BETWEEN THE ENCODER
AND THE VIDEO SERVER
While CODEC 102 will typically be implemented in hard-wired circuitry, MUX 
104 is preferably implemented by program-controlled circuitry, such as a processor 
programmed to execute a particular sequence of instructions that are stored in a memory. 
Consequently, MUX 104 may include a processor executing a conventional multiplexing
program that has been linked with and makes calls to a software library that controls the
communication with the video server 106. 

All data transmitted between the encoder 101 and the video server 106 is preferably
sent using a reliable communication mechanism. According to one embodiment, the video
content on communication channel 128 is handled as a simple bytestream and is
transmitted via a lightweight, reliable protocol. For example, TCP is sufficient on lightly
loaded networks. The control information and metadata on communication channel 130
contain more complicated data types and are sent via an object oriented protocol, such as
the Common Object Request Broker Architecture Interface Definition Language
("CORBA IDL").

Communication between the encoder 101 and the video server 106 occurs in 
sessions. According to one embodiment, a session is performed in three phases: OPEN,
SEND and CLOSE. The operations performed during each of the phases are as follows:

OPEN - any provisioning that the video server 106 needs to perform for network or
disk bandwidth or disk space occurs. A pipe for the video stream data (the "content") is
created.

SEND TAGS and SEND DATA - these calls are made multiple times as content is
encoded. The video server 106 stores all content immediately to disk and updates an end- 
of-file position. Tags are held in memory until the accompanying content data has been 
stored. Tags are held for an additional period of time to guarantee that a seek to that tag 
will succeed, i.e. that video pump 120 will not starve for data. 

CLOSE - content pipe is torn down. Server resources are released and content 
services and clients are notified that the feed has become a normal static piece of content. 
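
The three phases can be pictured as a small state machine on the server side. The Python sketch below is an assumption-laden illustration: the `FeedSession` class and its method names are hypothetical, a `bytearray` stands in for the file on disk, and resource provisioning and client notification are reduced to flags.

```python
class FeedSession:
    """Minimal model of one encoder-to-server session (OPEN, SEND, CLOSE)."""

    def __init__(self):
        self.state = "NEW"
        self.content = bytearray()  # stands in for content stored on disk
        self.end_of_file = 0        # position just past the last stored byte
        self.pending_tags = []      # tags held in memory
        self.live = False

    def _require_open(self):
        if self.state != "OPEN":
            raise RuntimeError("session is not open")

    def open(self):
        if self.state != "NEW":
            raise RuntimeError("session was already opened")
        self.state = "OPEN"         # provisioning would happen here
        self.live = True

    def send_data(self, chunk):
        self._require_open()
        self.content += chunk       # store content immediately
        self.end_of_file = len(self.content)

    def send_tags(self, tags):
        self._require_open()
        self.pending_tags.extend(tags)

    def close(self):
        self._require_open()
        self.state = "CLOSED"       # tear down the pipe, release resources
        self.live = False           # the feed is now ordinary static content
```

The sketch only enforces the ordering of the calls; when held tags may be released is a separate question that depends on how much content has been stored.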

Encoder 101 generates content data and control data in parallel. However, the 
control data associated with a particular portion of content is not necessarily generated by 
encoder 101 at the same time as the particular portion of content. For example, encoder
101 may actually determine how it is going to line up content frames before the encoder
101 actually lines up the frames. Under these circumstances, the control data that indicates
how the frames are lined up may be transmitted by encoder 101 before the content data that 
contains the frames. 



THE VIDEO SERVER 

Video server 106 receives the video stream and control data from encoder 101 and
causes the data to be stored in MDS 112. In the illustrated system, the video server 106
sends an MPEG video stream to MDS server 110, and MDS server 110 stores the MPEG
video stream in an MPEG file 134. In parallel, the video server 106 sends to the MDS
server 110 tag information extracted from the control data received on line 130. The tag
data is stored in a tag file 132 on disks 114. The video server 106 may also send
information about the content, including tag data, to be stored in database 116.

Once tag data is transmitted by video server 106, any other entity in the system, 
including video pump 120, may use the tag data to attempt to access the content associated 
with the tag data. Consequently, the immediate transmission of tag data to MDS server
110 may result in errors when, for example, the tag data arrives at video server 106 before
the corresponding content data. Therefore, prior to sending the tag data to MDS server
110, video server 106 buffers each tag data item in a tag buffer 108 until it is safe for
entities, such as video pump 120, to access the content associated with the tag data item.
The use of tag buffer 108 to avoid premature reads of content data is described in greater
detail hereafter.
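
One way to picture the buffering rule is a queue that releases a tag only once the stored content extends far enough past the data the tag points to. The Python sketch below is illustrative only: the `TagBuffer` class is hypothetical, and the safety margin is expressed in bytes as an assumed stand-in for the additional holding period mentioned earlier.

```python
from collections import deque


class TagBuffer:
    """Holds tags until the content they refer to is safely stored."""

    def __init__(self, margin_bytes=0):
        self.margin = margin_bytes
        self.pending = deque()  # (end_offset, tag) pairs in arrival order

    def add(self, end_offset, tag):
        """Queue a tag whose frame data ends at byte position end_offset."""
        self.pending.append((end_offset, tag))

    def release(self, end_of_file):
        """Return, in order, the tags that are now safe to pass on.

        A tag is safe when its frame data ends at least `margin` bytes
        before the current end-of-file position of the stored content.
        """
        ready = []
        while self.pending and self.pending[0][0] + self.margin <= end_of_file:
            ready.append(self.pending.popleft()[1])
        return ready
```

Calling `release` each time the end-of-file position advances would hand tags onward only after their content exists, which is the property the tag buffer is meant to guarantee.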

EXEMPLARY MPEG FILE 
Digital audio-visual storage formats, whether compressed or not, use state machines 
and packets of various structures. The techniques described herein apply to all such storage
formats. While the present invention is not limited to any particular digital audio-visual
format, the MPEG-2 transport file structure shall be described for the purposes of
illustration.

Figure 2a illustrates the structure of an MPEG-2 transport file 134 in
greater detail. The data within MPEG file 134 is packaged into three layers: a program
elementary stream ("PES") layer, a transport layer, and a video layer. These layers are
described in detail in the MPEG-2 specifications. At the PES layer, MPEG file 134
consists of a sequence of PES packets. At the transport layer, the MPEG file 134 consists
of a sequence of transport packets. At the video layer, MPEG file 134 consists of a
sequence of picture packets. Each picture packet contains the data for one frame of video.

Each PES packet has a header that identifies the length and contents of the PES 
packet. In the illustrated example, a PES packet 250 contains a header 248 followed by a
sequence of transport packets 251-262. PES packet boundaries coincide with valid
transport packet boundaries. Each transport packet contains exclusively one type of data.
In the illustrated example, transport packets 251, 256, 258, 259, 260 and 262 contain video
data. Transport packets 252, 257 and 261 contain audio data. Transport packet 253 
contains control data. Transport packet 254 contains timing data. Transport packet 255 is 
a padding packet. 

Each transport packet has a header. The header includes a program ID ("PID") for
the packet. Packets assigned PID 0 are control packets. For example, packet 253 may be
assigned PID 0. Other packets, including other control packets, are referenced in the PID 0
packets. Specifically, PID 0 control packets include tables that indicate the packet types of
the packets that immediately follow the PID 0 control packets. For all packets which are
not PID 0 control packets, the headers contain PIDs which serve as pointers into the table
contained in the PID 0 control packet that most immediately preceded the packets. For
example, the type of data contained in a packet with a PID 100 would be determined by
inspecting the entry associated with PID 100 in the table of the PID 0 control packet that
most recently preceded the packet.
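
The lookup rule described above can be sketched in Python as a single pass over the packets that remembers the most recent PID 0 table. The representation is an assumption for illustration: each packet is modeled as a `(pid, payload)` pair, and the payload of a PID 0 packet is modeled as a dictionary from PID to packet type rather than as real table bytes.

```python
def resolve_packet_types(packets):
    """Return the type of each packet in a sequence of (pid, payload) pairs.

    A PID 0 packet carries a table mapping PIDs to packet types. Every other
    packet is typed by looking its PID up in the most recent such table.
    """
    table = {}
    types = []
    for pid, payload in packets:
        if pid == 0:
            table = payload          # a new table replaces the previous one
            types.append("control")
        else:
            types.append(table.get(pid, "unknown"))
    return types
```

For example, a packet with PID 100 is reported as video while the governing table maps 100 to video, and as audio once a later PID 0 packet remaps it.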

In the video layer, the MPEG file 134 is divided according to the boundaries of
frame data. As mentioned above, there is no correlation between the boundaries of the
data that represent video frames and the transport packet boundaries. In the illustrated
example, the frame data for one video frame "F" is located as indicated by brackets 270.
Specifically, the frame data for frame "F" is located from a point 280 within video packet
251 to the end of video packet 251, in video packet 256, and from the beginning of video
packet 258 to a point 282 within video packet 258. Therefore, points 280 and 282
represent the boundaries for the picture packet for frame "F". The frame data for a second
video frame "G" is located as indicated by brackets 272. The boundaries for the picture
packet for frame "G" are indicated by bracket 276.

Structures analogous to those described above for MPEG-2 transport streams also
exist in other digital audio-visual storage formats, including MPEG-1, Quicktime, AVI,
Proshare and H.261 formats. In the preferred embodiment, indicators of video access
points, time stamps, file locations, etc. are stored such that multiple digital audio-visual
storage formats can be accessed by the same server to simultaneously serve different
clients from a wide variety of storage formats. Preferably, all of the format specific
information and techniques are incorporated in the tag generator and the stream server. All 
of the other elements of the server are format independent. 

EXEMPLARY TAG FILE 
The contents of an exemplary tag file 132 shall now be described with reference to 
Figure 2b. In Figure 2b, the tag file 132 includes a file type identifier 202, a length
indicator 204, a bit rate indicator 206, a play duration indicator 208, a frame number
indicator 210, stream access information 212 and an initial MPEG time offset 213. File
type identifier 202 indicates the physical wrapping on the MPEG file 134. For example,
file type identifier 202 would indicate whether MPEG file 134 is an MPEG-2 or an MPEG-1
file.

Length indicator 204 indicates the length of the MPEG file 134. Bit rate indicator 
206 indicates the bit rate at which the contents of the MPEG file 134 should be sent to a 
client during playback. The play duration indicator 208 specifies, in milliseconds, the 
amount of time required to play back the entire contents of MPEG file 134 during a normal 
playback operation. Frame number indicator 210 indicates the total number of frames 
represented in MPEG file 134.

Stream access information 212 is information required to access the video and 
audio streams stored within MPEG file 134. Stream access information 212 includes a
video elementary stream ID and an audio elementary stream ID. For MPEG-2 files, stream
access information 212 also includes a video PID and an audio PID. The tag file header
may also contain other information that may be used to implement other features. 
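
The header fields listed above could be modeled as a simple record, as in the Python sketch below. The class names, field names, field types and units (bytes for length, bits per second for bit rate) are assumptions made for illustration; only the play duration is stated in the text to be in milliseconds. The reference numerals in the comments tie each field back to the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StreamAccessInfo:                 # stream access information 212
    video_stream_id: int
    audio_stream_id: int
    video_pid: Optional[int] = None     # present for MPEG-2 files
    audio_pid: Optional[int] = None     # present for MPEG-2 files


@dataclass
class TagFileHeader:
    file_type: str                      # file type identifier 202
    length: int                         # length indicator 204 (bytes assumed)
    bit_rate: int                       # bit rate indicator 206 (bits/s assumed)
    play_duration_ms: int               # play duration indicator 208
    frame_count: int                    # frame number indicator 210
    stream_access: StreamAccessInfo
    initial_time_offset: int            # initial MPEG time offset 213

    def validate(self):
        """Check that MPEG-2 headers carry both PIDs."""
        info = self.stream_access
        if self.file_type == "MPEG-2" and (info.video_pid is None or info.audio_pid is None):
            raise ValueError("MPEG-2 tag files need a video PID and an audio PID")
```

A server could run a check such as `validate` when it opens a tag file, before it relies on the PIDs to pick video and audio packets out of a transport stream.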

In addition to the general information described above, the tag file 132 contains an 
entry for each frame within the MPEG file 134. The entry for a video frame includes
information about the state of the various MPEG layers relative to the position of the data
that represents the frame. For an MPEG-2 file, each entry includes the state of the MPEG-2
transport state machine, the state of the program elementary stream state machine and the
state of the video state machine. For an MPEG-1 file, each entry includes the current state
of the Pack system MPEG stream and the state of the video state machine.

Tag file entry 214 illustrates in greater detail the tag information that is stored for 
an individual MPEG-2 video frame "F". With respect to the state of the program
elementary stream state machine, the tag entry 214 includes the information indicated in
Table 1.

TABLE 1

DATA                              MEANING

PES OFFSET AT THE START OF        The offset, within the PES packet that
PICTURE 217                       contains the frame data for frame "F", of
                                  the first byte of the frame data for
                                  frame "F".

PES OFFSET AT THE END OF          The offset between the last byte in the
PICTURE 219                       frame data for frame "F" and the end of
                                  the PES packet in which the frame data
                                  for frame "F" resides.



With respect to the state of the video state machine, tag entry 214 includes the 
information indicated in Table 2.

TABLE 2

DATA                              MEANING

PICTURE SIZE 220                  The size of the picture packet for frame
                                  "F".

START POSITION 226                The location within the MPEG file of
                                  the first byte of the data that corresponds
                                  to frame "F".

TIME VALUE 228                    The time, relative to the beginning of
                                  the movie, when frame "F" would be
                                  displayed during a normal playback of
                                  MPEG file 134.

FRAME TYPE 232                    The technique used to encode the frame
                                  (e.g. I-frame, P-frame or B-frame).

TIMING BUFFER INFORMATION         Indicates how full the buffer of the
238                               decoder is (sent to the decoder to
                                  determine when information should be
                                  moved out of the buffer in order to
                                  receive newly arriving information).



With respect to the state of the transport layer state machine, tag entry 214 includes
the information indicated in Table 3.

TABLE 3

DATA                              MEANING

START OFFSET 234                  The distance between the first byte in
                                  the frame data and the start of the
                                  transport packet in which the first byte
                                  resides.

# OF NON-VIDEO PACKETS 222        The number of non-video packets (i.e.
                                  audio packets, padding packets, control
                                  packets and timing packets) that are
                                  located within the picture packet for
                                  frame "F".

# OF PADDING PACKETS 224          The number of padding packets that are
                                  located within the picture packet for
                                  frame "F".

END OFFSET 236                    The distance between the last byte in the
                                  frame data and the end of the packet in
                                  which the last byte resides.

CURRENT CONTINUITY COUNTER        The Continuity value associated with
215                               frame "F".

DISCONTINUITY FLAG 230            Indicates whether there is a
                                  discontinuity in time between frame "F"
                                  and the frame represented in the
                                  previous tag entry.



Assume, for example, that entry 214 is for the frame "F" of Figure 2b. The size 220
associated with frame "F" would be the bits encompassed by bracket 274. The number 222
of non-video packets would be five (packets 252, 253, 254, 255 and 257). The number 224
of padding packets would be one (packet 255). The start position 226 would be the
distance between the beginning of MPEG file 134 and point 280. The start offset 234
would be the distance between the start of packet 251 and point 280. The end offset 236
would be the distance between point 282 and the end of packet 258.
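
The transport-layer values in this example can be reproduced with a short Python sketch. MPEG-2 transport packets are 188 bytes long; everything else here is assumed for illustration: the function name, the convention that byte positions are measured from the start of packet 251, the reading of the end offset as the number of bytes that follow the last frame byte in its packet, and the two byte positions chosen to stand in for points 280 and 282.

```python
PACKET_SIZE = 188  # length in bytes of an MPEG-2 transport packet


def transport_tag_fields(packet_types, first_byte, last_byte):
    """Compute transport-layer tag fields for one frame.

    packet_types lists the type of each transport packet in file order.
    first_byte and last_byte are the positions of the first and last bytes
    of the frame data, counted from the start of the first listed packet.
    """
    first_pkt = first_byte // PACKET_SIZE
    last_pkt = last_byte // PACKET_SIZE
    spanned = packet_types[first_pkt:last_pkt + 1]
    return {
        "start_offset": first_byte % PACKET_SIZE,
        "end_offset": (last_pkt + 1) * PACKET_SIZE - (last_byte + 1),
        "non_video_packets": sum(1 for t in spanned if t != "video"),
        "padding_packets": sum(1 for t in spanned if t == "padding"),
    }


# Types of transport packets 251 through 262 in the illustrated example.
TYPES = ["video", "audio", "control", "timing", "padding", "video",
         "audio", "video", "video", "video", "audio", "video"]

# Hypothetical positions: byte 100 of packet 251 and byte 49 of packet 258.
fields = transport_tag_fields(TYPES, first_byte=100,
                              last_byte=7 * PACKET_SIZE + 49)
```

With these inputs the frame spans packets 251 through 258, so the sketch counts five non-video packets and one padding packet, in agreement with the values given in the text.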

The tag information generated for each frame in an MPEG-1 file is illustrated in 
Figure 2c. Referring to Figure 2c, entry 214 includes data indicating the state of three state 
machines: a system state machine, a pack state machine, and a video state machine. 
Specifically, tag entry 214 includes the information shown in Table 4.

TABLE 4

DATA                              MEANING

AMOUNT OF NON-VIDEO DATA          The amount of non-video data (in bytes)
221                               contained within the start and end
                                  boundaries of the frame data for frame
                                  "F".

AMOUNT OF PADDING DATA 223        The amount of padding data (in bytes)
                                  contained within the start and end
                                  boundaries of the frame data for frame
                                  "F".

PACK OFFSET AT START 225          The offset between the start boundary of
                                  the frame data for frame "F" and the
                                  beginning of the pack packet that
                                  contains the start boundary for frame
                                  "F".

PACK REMAINING AT START 227       The distance between the start boundary
                                  for frame "F" and the end of the pack
                                  packet that contains the start boundary
                                  of frame "F".

PACK OFFSET AT END 229            The offset between the end boundary for
                                  frame "F" and the beginning of the packet
                                  that contains the end boundary for frame
                                  "F".

PACK REMAINING AT END 231         The distance between the end boundary
                                  for frame "F" and the end of the pack
                                  packet that contains the end boundary of
                                  frame "F".

[label illegible]                 The distance (in bytes) between the start
                                  boundary for frame "F" and the end
                                  boundary for frame "F".

PICTURE START POS 235             The distance between the start of the
                                  MPEG-1 file and the start boundary for
                                  frame "F".

PICTURE END POS 237               The position, relative to the beginning
                                  of the MPEG-1 file, of the end boundary
                                  for frame "F".

FRAME TYPE 239                    The technique used to encode the data
                                  that represents frame "F".

TIME VALUE 241                    The time, relative to the beginning of
                                  the movie, when frame "F" would be
                                  displayed during a normal playback of
                                  MPEG file 134.

[label illegible]                 Indicates how full the decoder is (sent to
                                  the decoder to determine when
                                  information should be moved out of the
                                  buffer in order to receive newly arriving
                                  information).



The tag information includes data indicating the state of the relevant state machines 
at the beginning of video frames. However, the state machines employed by other digital 
audio-visual formats differ from those described above just as the state machines employed 
in the MPEG-1 format differ from those employed in MPEG-2. Consequently, the specific 
tag information stored for each frame of video will vary based on the digital audio-video
format of the file to which it corresponds.



THE MDS

MDS 112 includes MDS server 110 and one or more non-volatile storage devices,
such as disks 114. In the illustrated embodiment, MPEG file 134 is stored across numerous
disks 114 to increase the fault tolerance of the system. Consider, for example, the
multi-disk system 300 illustrated in Figure 3. System 300 includes N+1 disk drives. An MPEG
file is stored on N of the N+1 disks. The MPEG file is divided into sections 350, 352, 354
and 356. Each section is divided into N blocks, where N is the number of disks that will be
used to store the MPEG file. Each disk stores one block from a given section.

In the illustrated example, the first section 350 of the MPEG file includes blocks
310, 312 and 314 stored on disks 302, 304 and 306, respectively. The second section 352
includes blocks 316, 318 and 320 stored on disks 302, 304 and 306, respectively. The third
section 354 includes blocks 322, 324 and 326 stored on disks 302, 304 and 306,
respectively. The fourth section 356 includes blocks 328, 330 and 332 stored on disks 302,
304 and 306, respectively.

The disk 308 which is not used to store the MPEG file is used to store check bits.
Each set of check bits corresponds to a section of the MPEG file and is constructed based
on the various blocks that belong to the corresponding section. For example, check bits
334 correspond to section 350 and are generated by performing an exclusive OR operation
on all of the blocks in the first section 350. Similarly, check bits 336, 338 and 340 are the
products of an exclusive OR performed on all of the blocks in sections 352, 354 and
356, respectively.

System 300 has a higher fault tolerance than a single disk system in that if any disk
in the system ceases to operate correctly, the contents of the bad disk can be reconstructed
based on the contents of the remaining disks. For example, if disk 304 ceases to function,
the contents of block 312 can be reconstructed based on the remaining blocks in section
350 and the check bits 334 associated with section 350. Similarly, block 318 can be
reconstructed based on the remaining blocks in section 352 and the check bits 336 associated
with section 352. This error detection and correction technique is generally known as
"Redundant Array of Inexpensive Disks" or RAID.

During real-time playback using RAID, video pump 120 reads and processes the
MPEG file on a section by section basis so that all of the information is available to
reconstruct any faulty data read from disk. Techniques for performing RAID in real time
are described in U.S. Patent No. 5,623,595, entitled "METHOD AND APPARATUS FOR
TRANSPARENT, REAL TIME RECONSTRUCTION OF CORRUPTED DATA IN A
REDUNDANT ARRAY DATA STORAGE SYSTEM", the contents of which are
incorporated herein by this reference.
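
The check-bit construction and the reconstruction of a lost block described above can be sketched in a few lines of Python. The sketch is illustrative only: the function names and the three-byte block contents are hypothetical, and a real array would operate on full disk blocks.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise exclusive OR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, check_bits):
    """Rebuild the single missing block of a section.

    XOR-ing the surviving blocks with the check bits cancels every
    surviving block and leaves the contents of the missing one.
    """
    return xor_blocks(list(surviving_blocks) + [check_bits])


# Hypothetical contents of blocks 310, 312 and 314 of section 350.
section_350 = [b"\x10\x20\x30", b"\x0f\x0e\x0d", b"\xaa\xbb\xcc"]
check_bits_334 = xor_blocks(section_350)

# Disk 304 fails: block 312 is rebuilt from blocks 310 and 314 plus check bits 334.
rebuilt_312 = reconstruct([section_350[0], section_350[2]], check_bits_334)
```

Because the exclusive OR is its own inverse, the same routine serves both to generate the check bits and to recover any one missing block of a section.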

During normal playback operations, there is sufficient time to perform the disk
accesses required to read an entire section while the data from the previous section is being
transmitted in the MPEG data stream. However, during fast forward and fast rewind
operations, less than all of the data in any section will be sent in the MPEG data stream.
Because less data is sent, the transmission of the data will take less time. Consequently,
less time will be available to read and process the subsequent section.

For example, assume that only one frame X from section 350 was selected for
display during a fast forward operation. During the time it takes to transmit the segment
for frame X, the data for the next selected frame Y is read and processed. Assume that the
next frame Y is located in section 352. If the MPEG file is read and processed on a section
by section basis (required for RAID), then all of the blocks in section 352 are read and
processed during the transmission of the single frame X. Even if it were possible to read
and process all of the blocks in section 352 in the allotted time, it may still be undesirable
to do so because of the resources that would be consumed in performing the requisite disk
accesses.

In light of the foregoing, video pump 120 does not use RAID during fast forward
and fast rewind operations. Rather, video pump 120 reads, processes and transmits only
the data indicated in the commands it receives from the stream server 118. Thus, in the
example given above, only the frame data for frame Y would be read and processed during
the transmission of the segment for frame X. By bypassing RAID during fast forward and
fast rewind operations, disk bandwidth remains at the same level or below that used during
normal playback operations.

Since RAID is not used during real-time fast forward and fast rewind operations,
faulty data cannot be reconstructed during these operations. Consequently, when the video
pump 120 detects that the data for a selected frame is corrupted or unavailable, the video
pump 120 discards the entire segment associated with the problem frame. Thus, if the data
associated with a frame cannot be sent, then the prefix and suffix data for the frame is not
sent either. However, any padding packets that were to be sent along with the prefix or
suffix data will still be sent.

By sending data in entire "segments", conformance with the digital audio-visual
format is maintained. In one embodiment, the video pump 120 will send down padding
packets to fill the line to maintain the correct presentation rate. In the preferred
embodiment, this behavior is selectable by the client.

DATA STRIPING 

The RAID techniques described above improve both throughput (because all data
from all disks in an array are read in parallel) and reliability (due to error correction). To
further increase throughput, RAID may be used in conjunction with data striping. Using
data striping, segments of logically sequential data are written to multiple physical devices
(or sets of physical devices) in a round-robin fashion. The amount of data stored at each
storage element in the round-robin sequence is referred to as a "stripe". When each
storage element in the round-robin sequence is an array of RAID disks, each segment of
data is referred to as a RAID stripe.

Figure 3B illustrates a system in which data striping is used in conjunction with
RAID. The system of Figure 3B is similar to that of Figure 3A with the exception that each
of the disks in Figure 3A has been replaced by a series of M disks. Thus, disk 302 has
been replaced by disks 302-1 through 302-M. The segment portions stored on disk 302
have been stored on disks 302-1 to 302-M in a sequential, round robin fashion. For
example, assume that the MPEG file has been divided into 50 segments and that disk 302
has been replaced with 25 disks. Under those circumstances, disk 302-1 would store the
first portion of segments 1 and 26. Disk 302-2 would store the first portion of segments 2
and 27, etc.
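
The round-robin placement in this example reduces to modular arithmetic. The following Python sketch reproduces the 50-segment, 25-disk case; the function names are hypothetical and segments and disks are numbered from 1 to match the text.

```python
def disk_for_segment(segment, num_disks):
    """Return the 1-based index of the disk (302-1 .. 302-M) holding a 1-based segment."""
    return (segment - 1) % num_disks + 1


def segments_on_disk(disk, num_segments, num_disks):
    """List the segments whose first portion lands on the given disk."""
    return [s for s in range(1, num_segments + 1)
            if disk_for_segment(s, num_disks) == disk]


# 50 segments striped over the 25 disks that replace disk 302.
on_disk_302_1 = segments_on_disk(1, 50, 25)   # segments 1 and 26
on_disk_302_2 = segments_on_disk(2, 50, 25)   # segments 2 and 27
```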

Data striping increases throughput because different processes can be reading from
different disk arrays in parallel. For example, one data pump may be reading the first
segment of an MPEG file from the RAID array that includes Disk_1,1 through
Disk_1,N+1, while another data pump is concurrently reading the second segment of the
same MPEG file from the RAID array that includes Disk_2,1 through Disk_2,N+1.

For throughput performance reasons, reads and writes occur in discrete chunks,
typically disk RAID stripes. In a typical digital video delivery system, each access unit is
256kB or 2 megabits, and the content is 2Mb/sec MPEG. Consequently, each RAID stripe
corresponds to approximately one second of video, though this could easily range from
about 0.2 to 10 seconds per stripe depending on content bit rate and server configuration.

THE CLIENT

Audio-visual information delivery system 100 contains one or more clients, such as
client 122. Client 122 generally represents devices configured to decode audio-visual
information contained in a stream of digital audio-visual data. For example, client 122
may be a set top converter box coupled to an output display, such as a television. Client
122 includes a decoder 126 for decoding a digital data stream, and a control unit 124 for
communicating information to the stream server 118.

Stream server 118 is able to receive information from client 122 over a control
network 140. Control network 140 may be any network that allows communication
between two or more devices. For example, control network 140 may be a high bandwidth
network, an X.25 circuit or an electronic industry association (EIA) 232 (RS-232) serial
line.

The client 122 communicates with the stream server 118 and database 116 via the
control network 140. For example, client 122 may send a query to database 116 requesting
information about what is currently available for viewing. The database 116 responds by
sending the requested information back to the client 122. The user of client 122 may then
select to view a particular audio-visual work beginning at a particular location and at a
particular speed. Client 122 transmits requests to initiate the transmission of audio-visual
data streams and control information to affect the playback of ongoing digital audio-visual
transmissions through network 140 to stream server 118.

THE VIDEO PUMP AND STREAM SERVER

The video pump 120 is coupled to the stream server 118 and receives commands
from the stream server 118. The video pump 120 is coupled to the disks 114 such that the
video pump 120 stores and retrieves data from the disks 114.

In addition to communicating with the stream server 118, the client 122 receives
information from the video pump 120 through a high bandwidth network 150. The high
bandwidth network 150 may be any type of circuit-style network link capable of
transferring large amounts of data. A circuit-style network link is configured such that the
destination of the data is guaranteed by the underlying network, not by the transmission
protocol. For example, the high bandwidth network 150 may be an asynchronous transfer
mode (ATM) circuit or a physical type of line, such as a T1 or E1 line. In addition, the
high bandwidth network 150 may utilize a fiber optic cable, twisted pair conductors,
coaxial cable, or a wireless communication system, such as a microwave communication
system.

Network 150 may alternatively be a relatively low bandwidth network, or a
combination of high bandwidth and low bandwidth communication mediums. For
example, a portion of network 150 may comprise a relatively high bandwidth ATM circuit,
while a relatively low bandwidth device such as a 28.8K modem is used downstream to
deliver the video information from the network to the client 122.

The audio-visual information delivery system 100 permits a server, such as the
video pump 120, to transfer large amounts of data from the disks 114 over the high
bandwidth network 150 to the client 122 with minimal overhead. In addition, the
audio-visual information delivery system 100 permits the client 122 to transmit requests to the
stream server 118 using a standard network protocol via the control network 140. In a
preferred embodiment, the underlying protocol for the high bandwidth network 150 and the
control network 140 is the same. The stream server 118 may consist of a single computer
system, or may consist of a plurality of computing devices configured as servers.
Similarly, the video pump 120 may consist of a single server device, or may include a
plurality of such servers.

To receive a digital audio-visual data stream from a particular digital audio-visual
file, client 122 transmits a request to the stream server 118. In response to the request, the
stream server 118 transmits commands to the video pump 120 to cause video pump 120 to
transmit the requested digital audio-visual data stream to the client that requested the
digital audio-visual data stream. For live feeds, the video server 106 will be storing the
video stream into the video file 134 at the same time the video pump 120 is sending a video
stream from the file 134 to the client 122.

The commands sent to the video pump 120 from the stream server 118 include
control information specific to the client request. For example, the control information
identifies the desired digital audio-visual file, the beginning offset of the desired data
within the digital audio-visual file, and the address of the client. In order to create a valid
digital audio-visual stream at the specified offset, the stream server 118 also sends "prefix
data" to the video pump 120 and requests the video pump 120 to send the prefix data to the
client. As shall be described in greater detail hereafter, prefix data is data that prepares the
client to receive digital audio-visual data from the specified location in the digital
audio-visual file.

The video pump 120, after receiving the commands and control information from
the stream server 118, begins to retrieve digital audio-visual data from the specified
location in the specified digital audio-visual file on the disks 114. For the purpose of
explanation, it shall be assumed that audio-visual information delivery system 100 delivers
audio-visual information in accordance with one or more of the MPEG formats.
Consequently, video pump 120 will retrieve the audio-visual data from an MPEG file 134
on the disks 114.

The video pump 120 transmits the prefix data to the client, and then seamlessly
transmits MPEG data retrieved from the disks 114 beginning at the specified location to the
client. The prefix data includes a packet header which, when followed by the MPEG data
located at the specified position, creates an MPEG compliant transition packet. The data
that follows the first packet is retrieved sequentially from the MPEG file 134, and will
therefore constitute a series of MPEG compliant packets. The video pump 120 transmits
these packets to the requesting client via the high bandwidth network 150.

The requesting client receives the MPEG data stream, beginning with the prefix
data. The client decodes the MPEG data stream to reproduce the audio-visual sequence
represented in the MPEG data stream.

PREMATURE READ AVOIDANCE

When client 122 is playing an MPEG stream at the same time the MPEG stream is
being generated by encoder 101, safeguards should be taken to ensure that client 122 does
not stall (because it has reached the end of valid content data) or play bad data (because it
has read beyond the end of the currently available content data). If the video pump 120
prematurely reads a stripe of disks 114, video pump 120 will send invalid data to the client
122, resulting in the display of unintended content or garbage (dirty content). Such a
premature read will occur if, for example, a user requests display of a portion of the video
stream that has not yet been stored on disks 114. To prevent this, end-of-file information
for MPEG file 134 is maintained to indicate the current end of file 134. As more content
data is added to file 134, the end-of-file information is updated so that the new data may be
accessed.

One approach to avoid premature reads is to repeatedly update a table of contents
on disks 114 with a new end-of-file value, and have the video pump 120 check this value
before reading stripes from disks 114. The MDS server 110 updates the end-of-file to
indicate that the content file 134 includes new content only after it has been verified that
the new content has been successfully stored to disks 114. Unfortunately, unless this
end-of-file information is guaranteed to be held in dynamic memory, this technique leads to a
jitter in the latency period of updates that is difficult to predict.

Another approach to avoid premature reads is for the MDS server 110 to actively
transmit the new end-of-file information to all processes that are reading the content. Thus,
MDS server 110 stores content data into file 134 on disks 114, waits for verification that
the content has been stored, and then transmits messages indicating the existence of the
newly stored content to all processes reading the content data (e.g. video pump 120). The
MDS server 110 may make such end-of-file notification messages periodically (e.g. after
every 5 seconds) or after a predetermined amount of new content data has been
successfully stored (e.g. after every 1 Megabyte). Unfortunately, notification times will
also jitter due to variations in the content arrival times, which are a function of the encoder
101 and the network between the encoder 101 and the video server 106.

According to one embodiment, the tag information is used to indicate the current
end-of-file. Specifically, video server 106 effectively updates the end-of-file of file 134 by
sending tag information from tag buffer 108 for storage by MDS 112. As soon as the tag
information that corresponds to a particular portion of content has been transmitted by
video server 106, the video pump 120 is free to perform a seek to that particular portion of
video. Until the tag information that corresponds to a particular portion of video is
released, video pump 120 may not perform a seek to the corresponding portion of video.
Because the newest tag information indicates the current end-of-file, newly connected users
may simply seek to the content associated with the newest tag information, and begin
playing the feed at the real-time rate.
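
The rule that a seek is permitted only once the corresponding tag has been released can be illustrated with a small index of released tags. The Python sketch below is a simplified model under stated assumptions: the class name `TagIndex`, its methods, and the reduction of a tag to a (time value, byte offset) pair are hypothetical and are not taken from the described system.

```python
import bisect


class TagIndex:
    """Tags released so far for a content file that is still being written."""

    def __init__(self):
        self._times = []     # time values of released tags, ascending
        self._offsets = []   # byte offset of the frame for each released tag

    def release(self, time_value, byte_offset):
        """Record a tag that the video server has released."""
        self._times.append(time_value)
        self._offsets.append(byte_offset)

    def end_of_file(self):
        """The newest released tag marks the current end-of-file (None if empty)."""
        return self._offsets[-1] if self._offsets else None

    def seek(self, time_value):
        """Return the offset of the latest released frame at or before time_value.

        Returns None when the requested time lies beyond the newest released
        tag, so that a reader never seeks into content not yet made available.
        """
        if not self._times or time_value > self._times[-1]:
            return None
        i = bisect.bisect_right(self._times, time_value) - 1
        return self._offsets[i] if i >= 0 else None


index = TagIndex()
index.release(0.0, 0)
index.release(0.5, 4096)
```

A newly connected reader in this model would call `end_of_file()` to find the live edge, while a seek past that edge is refused rather than served from unverified data.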

MINIMUM TAG DELAY PERIOD

To prevent client 122 from stalling or playing bad data, the transmission of tag data
from tag buffer 108 to MDS 112 is delayed. Preferably, the duration of the delay is long
enough to ensure that the associated content data will not be prematurely accessed. On the
other hand, delaying the tag data longer than necessary increases the latency between when
content is encoded and when users can seek or scan to the content. Consequently, it is
desirable to determine a minimum tag delay period, and to buffer tag data in tag buffer 108
for the minimum tag delay period. The minimum tag delay period for a tag data item is
determined by the maximum latency associated with delivering the corresponding content
data from encoder 101 to video pump 120.

Video server 106 includes a network buffer 152 and a write buffer 154. Typically,
the video server 106 will be reading content data from channel 128 into network buffer 152
at the same time that video server 106 is writing content data from write buffer 154 to disks
114. In embodiments that use RAID storage techniques, content data is received and
buffered within video server 106 in units that correspond to one RAID stripe.

Video pump 120 includes a prefetch unit 146 and a buffer 144. Video pump 120
reads content data from disks 114 asynchronously. To read content data, prefetch unit 146
requests the transmission of a particular portion of content data, and disks 114 respond by
either sending the requested content data or by indicating that the requested data cannot be
sent. Some latency occurs between the time the prefetch unit 146 requests data, and the
time the data is received by video pump 120.

When content data from file 134 arrives at video pump 120, video pump 120 stores
the content data from file 134 into the buffer 144. As bandwidth becomes available on
network 150, video pump 120 transmits content data from the buffer 144 over network 150
to client 122. As with the video server 106, content data is pre-fetched and buffered within
video pump 120 in units that correspond to one RAID stripe when RAID storage
techniques are used.

As explained above, the video pump 120 is typically copying data from one RAID
stripe into network buffers and prefetching the following stripe. Likewise, the video server
106 is typically writing one RAID stripe of content to the data store and receiving data
from the network into a second memory buffer. Consequently, there are typically four
RAID stripes "in transit", so the latency between when any content data is generated and
when it is available to be played is approximately the time it takes to deliver four RAID
stripes worth of data.

RAID stripes are usually 128K bits or 256K bits per disk. The combined total of all
disks in a RAID stripe is therefore 1 to 2 Megabits. For typical MPEG files, each RAID
stripe will correspond to approximately one second of video. Consequently, having four
RAID stripes in transit results in a minimum latency of approximately 4 seconds.
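
The latency estimate follows from simple arithmetic, shown here as a short Python sketch. The function names are hypothetical; the 2 megabit stripe and the 2 Mb/sec content rate are the typical figures quoted in the text, not measured values.

```python
def stripe_seconds(stripe_bits, content_bits_per_second):
    """Playback time represented by one RAID stripe."""
    return stripe_bits / content_bits_per_second


def content_latency(stripes_in_transit, stripe_bits, content_bits_per_second):
    """Delay between generation of content and its availability for playback."""
    return stripes_in_transit * stripe_seconds(stripe_bits, content_bits_per_second)


# Two stripes held by the video server plus two held by the video pump.
one_stripe = stripe_seconds(2_000_000, 2_000_000)
latency = content_latency(4, 2_000_000, 2_000_000)
```

With a 1 megabit stripe at the same content rate, each stripe would hold half a second of video and the same four stripes in transit would amount to roughly two seconds.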

The implication for tag data is that a given tag may only be released by the video
server 106 for use by other entities when the corresponding content is available to be
played (i.e. has been successfully stored on disk for two seconds). Therefore, in a video
delivery system where the content delivery has a four second latency, the tag data retained
in tag buffer 108 is transmitted no earlier than four seconds after the generation of the
corresponding content.

According to one embodiment, jitter and stalling are both avoided by transmitting a
batch of tag data from tag buffer 108 to MDS 112 every twelve seconds. The tag data
batch transmitted at every twelve second interval includes all tag information in tag buffer
108 that is at least twelve seconds old. Tag data that is less than twelve seconds old is
retained in tag buffer 108 and transmitted to MDS 112 in a batch at the end of the next
twelve second interval. MDS server 110 sends the tag data to the various entities (e.g.
video pump 120) that are reading video file 134, and then stores the tag information on
disks 114.
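
The twelve second batching rule can be modelled with a queue of pending tags ordered by age. In the Python sketch below the class name `TagBuffer`, its methods and the use of plain strings as tags are hypothetical; times are seconds on an arbitrary clock supplied by the caller.

```python
from collections import deque

BATCH_INTERVAL = 12.0  # seconds between batches, as in the embodiment above


class TagBuffer:
    """Holds newly generated tags until they are old enough to release."""

    def __init__(self):
        self._pending = deque()  # (created_at, tag), oldest first

    def add(self, created_at, tag):
        self._pending.append((created_at, tag))

    def flush(self, now, min_age=BATCH_INTERVAL):
        """Remove and return every tag that is at least min_age seconds old."""
        batch = []
        while self._pending and now - self._pending[0][0] >= min_age:
            batch.append(self._pending.popleft()[1])
        return batch


buf = TagBuffer()
buf.add(0.0, "tag-a")
buf.add(5.0, "tag-b")
buf.add(11.0, "tag-c")
first_batch = buf.flush(now=12.0)    # only tag-a is twelve seconds old
second_batch = buf.flush(now=24.0)   # tag-b and tag-c follow in the next batch
```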

DIGITAL CHANNELS

Video files generated for specific audio-visual works, such as sporting events, will
be of finite length. Consequently, their corresponding content files will also consume a
finite amount of storage making it practical to perpetually store the entire content file for
later viewing. However, a traditional television "channel" is composed of a never-ending
sequence of audio-visual works. Perpetually retaining all of the content of a digital channel
would continuously consume storage at an unacceptably high rate. On the other hand, it is
desirable to allow users to view programs that they may not have been able to view at the
time the programs were originally broadcast. For example, it would be desirable for a
viewer to have access to the last 24 hours of programming that was broadcast over a digital
channel. According to an embodiment of the invention, historical viewing for an infinite
feed is provided through the use of a continuous finite buffer, where older data "expires"
and is overwritten with new data.

CONTENT EXPIRATION

In order to have a continuous buffer of data, for instance the last 24 hours of
Lifetime, Television for Women, older content needs to be deleted along with the
corresponding tags. Various approaches may be used to implement such a continuous
buffer.

With respect to the content data, the simplest approach to implement a continuous
buffer is to create a single file large enough to hold 24 hours of footage. The file is then
treated as a circular buffer. Specifically, after the creation of the initial 24 hour file, the
MDS server 110 would establish the beginning of the file as the current "insertion point".
The MDS server 110 would then store new content data over the old data at the insertion
point, and move the insertion point to the end of the new data. When the insertion point
hits the end of the file, it wraps around again to the beginning of the file.
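
The single-file circular buffer can be sketched as follows. The Python class below is a toy model: the name `CircularContentFile` is hypothetical, and an in-memory byte array of eight bytes stands in for a file sized to hold 24 hours of footage.

```python
class CircularContentFile:
    """Fixed-size store in which new content overwrites the oldest content."""

    def __init__(self, capacity):
        self.data = bytearray(capacity)
        self.insertion_point = 0  # starts at the beginning of the file

    def write(self, chunk):
        """Store chunk at the insertion point, wrapping at the end of the file."""
        for byte in chunk:
            self.data[self.insertion_point] = byte
            self.insertion_point = (self.insertion_point + 1) % len(self.data)


f = CircularContentFile(8)
f.write(b"abcdef")   # insertion point moves to offset 6
f.write(b"ghij")     # "gh" fills the tail, "ij" overwrites the start
```

After the second write the oldest bytes "ab" are gone, which is the behaviour that makes a paused client's position hard to protect in this approach.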

Unfortunately, this single-file circular buffer approach makes it difficult to grow or
shrink the time of the file. For example, assume that the insertion point is in the middle of
the file, and a decision is made to expand the file to cover 48 hours. Under these
circumstances, the MDS server 110 could not begin to extend the time covered for another
12 hours, when the insertion point would have reached the end of the file. Using the single
circular buffer approach, it is also difficult to detect if a client has paused and had the
"horizon" move over their position, such that when they resume the content they were
watching has been overwritten.

Figure 4 illustrates an alternative, more flexible approach to buffering a
predetermined amount of an infinite video feed. Referring to Figure 4, the content data is
stored in a group of smaller files 402-414. Each of the smaller files stores a subset of the
buffered content data. In the illustrated embodiment, each of files 402-412 stores two hours
worth of content. File 414 currently stores one hour of content. The current insertion point
is at the end of file 414. When file 414 reaches two hours of content, file 414 will be
closed and a new content file will be created. As content files age, the older content files
are deleted to free up disk space for new files. During playback, files are joined together
seamlessly by the video pump as the content data is sent to the client.

When the buffering technique illustrated in Figure 4 is used, a lenient expiration
policy can be set. Specifically, a policy may be established that a file is not deleted until
all clients have finished with the file (and any files that precede the file). For example,
assume that users are allowed to access the last 12 hours of a feed. When file 414 is
completed, files 404-414 will contain the most recent 12 hours, so file 402 is no longer
required. However, a client may currently be viewing the contents of file 402.
Consequently, file 402 is not immediately deleted. Rather, new clients are prevented from
accessing file 402, but the client currently accessing file 402 is allowed to finish playing
file 402. When the last client has finished playing file 402, the file 402 is deleted.

To put a cap on the number of existing files, a time limit may be established for
clients to finish playing old files. For example, when file 414 is completed, not only are
new clients prevented from accessing file 402, but the clients currently accessing file 402
are given two hours to finish playing file 402. At the end of two hours, file 402 is then
deleted to free up disk space without regard to whether any clients were still reading file
402.
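
The expiration policy for the multi-file buffer, including the time limit, can be sketched as a small bookkeeping class. The Python below is a simplified model under stated assumptions: the name `RollingBuffer` and its methods are hypothetical, times are in hours, a file is added when it is completed, and the condition about files that precede a given file is not modelled.

```python
class RollingBuffer:
    """Tracks which content files new clients may open and which are retiring."""

    def __init__(self, window, grace):
        self.window = window    # number of files kept open to new clients
        self.grace = grace      # time allowed for readers to finish a retiring file
        self.active = []        # file names open to new clients, oldest first
        self.retiring = {}      # file name -> deletion deadline
        self.readers = {}       # file name -> set of clients currently reading

    def add_file(self, name, now):
        """Register a completed file and retire files that fall out of the window."""
        self.active.append(name)
        self.readers[name] = set()
        while len(self.active) > self.window:
            old = self.active.pop(0)
            self.retiring[old] = now + self.grace

    def open(self, name, client):
        """Admit a client only if the file is still open to new clients."""
        if name not in self.active:
            return False
        self.readers[name].add(client)
        return True

    def close(self, name, client):
        self.readers.get(name, set()).discard(client)

    def reap(self, now):
        """Delete retiring files with no readers left or whose deadline has passed."""
        deleted = [n for n, deadline in self.retiring.items()
                   if not self.readers[n] or now >= deadline]
        for n in deleted:
            del self.retiring[n]
            del self.readers[n]
        return deleted


buf = RollingBuffer(window=6, grace=2.0)
for name in ["402", "404", "406", "408", "410", "412"]:
    buf.add_file(name, now=0.0)
buf.open("402", "client-1")
buf.add_file("414", now=12.0)   # file 402 falls outside the 12 hour window
```

In this model file 402 survives while its reader is active and the two hour limit has not elapsed, and is removed by `reap` as soon as either condition changes.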



TAG EXPIRATION 

When a content file (e.g. file 402) is deleted, the tags that correspond to the deleted
content file are considered "expired", and therefore can also be deleted. Ideally, tags are
stored in a format, such as a database table, that allows easy deletion of old tags as well as
the addition of new ones. Unfortunately, the overhead associated with storing and
retrieving tags from a database table may be too expensive to be practical under live feed
conditions. For ease and speed of access, therefore, tags are typically stored in a flat file.

Figure 5 illustrates a flat tag file 500. The flat tag file 500 includes a
header 502 followed by a set of tags 504. The header 502 indicates information about the
contents of tag file 500, including the set of content files to which the tags within tag file
500 correspond.

As new tags arrive, the tags are appended to tag file 500. Because tag file 500 is
associated with a continuous feed, tag file 500 will grow indefinitely unless a mechanism is
provided for deleting expired tags. However, tag file 500 itself should remain valid even
after the expiration of some tags (e.g. tags 510) within the tag file 500, since clients may
continue to access and use the tags 512 within tag file 500 that have not yet expired.
Therefore, the expiration mechanism cannot simply delete the expired tags 510 from the
tag file 500.

Rather than directly delete the expired tags from within tag file 500, a temporary
tag file 514 is created by constructing a new header 506 and appending to the new header
506 a copy of the unexpired tags 512 from the old tag file 500. The new header 506
includes the same information as the old header 502, except that data within header 502
indicates that tag file 500 includes tags for the deleted content file, while data within
header 506 does not.

While new tag file 514 is being created, new tag data is appended to both the new
tag file 514 and the old tag file 500. After the new tag file 514 is created, new tag data is
appended to the new tag file 514 rather than the old tag file 500. To ensure that the new tag
data is appended after tag data 512, storage for the to-be-copied tags 512 is preallocated in
the new tag file 514, and the new tags are appended after the preallocated storage while the
existing tags 512 are copied into the preallocated storage.

When all of the unexpired tags 512 have been copied to the new tag file 514, the
old tag file 500 is closed and the new tag file 514 is renamed over the top of the old tag file
500. After the new tag file 514 has been renamed, the tag file readers (e.g. stream server
118) that were using the old tag file 500 are reset based on the information contained in the
header of the new tag file 514. According to one embodiment (the "push model"),
messages are sent to the tag file readers to expressly inform them that the tag file has been
modified, and that they should update themselves based on header information in the new
tag file 514.
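
The copy-and-rename step can be illustrated with a short Python sketch. It is a single-threaded simplification under stated assumptions: the on-disk layout (one JSON header line followed by one JSON object per tag), the field names `content_files` and `content_file`, and the function name are all hypothetical, and the concurrent appending and preallocation described above are omitted.

```python
import json
import os


def rewrite_tag_file(path, expired_file):
    """Copy the header and unexpired tags to a temporary file, then rename it
    over the old tag file so that readers always find a complete file."""
    with open(path) as old:
        header = json.loads(old.readline())
        tags = [json.loads(line) for line in old]

    # The new header no longer lists the deleted content file.
    header["content_files"] = [name for name in header["content_files"]
                               if name != expired_file]

    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as new:
        new.write(json.dumps(header) + "\n")
        for tag in tags:
            if tag["content_file"] != expired_file:
                new.write(json.dumps(tag) + "\n")

    os.replace(tmp_path, path)  # rename the new file over the old one
    return header
```

`os.replace` is used because it overwrites the destination in a single rename, which matches the "renamed over the top of the old tag file" step; the header it returns is what a reader would use to reset itself.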

According to an alternative "pull model" embodiment, the tag file readers are not
expressly informed. Rather, they are configured to read and update themselves based on
the header information of the new tag file if they ever fail in an attempt to read a tag. The
pull model approach has the benefit that it avoids the transmission of messages which,
under many circumstances, are not necessary.

When tags associated with a particular content segment are deleted, clients may
continue to view the content segment. However, the clients will not be able to perform
non-sequential access operations that require the deleted tag information, such as fast
forward and rewind.

TIMESTAMPING

Tag information includes timestamp information for each of the frames in the
corresponding content data. For the purposes of decoding, the timestamp information
typically represents time relative to the beginning of a feed (i.e. the "presentation time"),
and is mapped to the byte offset in the content file of the frame that corresponds to that
presentation time. However, for continuous feeds, such relative time values are not
meaningful. For example, a user would want to request playback beginning at Jan 21,
1997 16:30:23, rather than beginning at 5,345,789.76 seconds from the time a station
began broadcasting.

According to one embodiment of the invention, absolute time values are supported
by storing an absolute time value that corresponds to the "zero" relative time value.
Therefore, when a client specifies playback from an absolute time, the absolute time value
associated with "zero" is subtracted from the specified absolute time value to yield a
relative time value. The relative time value is then used by stream server 118 to identify
the appropriate tag information, and the tag information is used by stream server 118 to
cause video pump 120 to begin sending content from the appropriate location in the content
file 134.
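
The subtraction described above can be sketched as follows. The function names, the representation of tag information as sorted `(relative_seconds, byte_offset)` pairs, and the choice of the nearest tag at or before the requested time are assumptions made for illustration, not details taken from the described embodiment.

```python
from bisect import bisect_right
from datetime import datetime
from typing import List, Tuple


def to_relative_seconds(requested: datetime, zero_time: datetime) -> float:
    """Subtract the absolute time stored for relative time "zero" from the request."""
    return (requested - zero_time).total_seconds()


def find_offset(tags: List[Tuple[float, int]], relative: float) -> int:
    """Return the byte offset of the last tag at or before the relative time.

    `tags` is assumed to be sorted by relative time.
    """
    times = [t for t, _ in tags]
    i = bisect_right(times, relative) - 1
    if i < 0:
        raise ValueError("requested time precedes the first stored tag")
    return tags[i][1]
```

For example, if the stored "zero" time were Jan 21, 1997 16:00:00, a request for Jan 21, 1997 16:30:23 would yield a relative time of 1823 seconds, which is then used to look up a position in the content file.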

Typically, the transport formats of digital video provide a fixed number of bits (e.g.
33 bits) to represent timestamps. For continuous feed environments, the relative timestamp
values will inevitably reach numbers that cannot be represented by the number of bits
available in the transport format. When this occurs, the timestamp values "wrap" and
begin again at zero.

To address the wrapping problem, a higher-precision wrap value (e.g. 64 bits) is
maintained. When performing a seek or other non-sequential access, the stream server 118
uses the higher-precision timestamp values. When transmitting content to a client, the
video pump 120 sends the lower-precision timestamps.
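
One way to maintain higher-precision values alongside 33-bit transport timestamps is sketched below. The class `TimestampExtender`, the function `to_transport`, and the assumption that timestamps are observed in non-decreasing order (so that any decrease signals a wrap) are illustrative choices, not details of the described embodiment.

```python
from typing import Optional

WRAP_BITS = 33            # width of the transport-format timestamp (example from the text)
WRAP = 1 << WRAP_BITS     # value at which a 33-bit timestamp wraps back to zero


class TimestampExtender:
    """Track wraps of a 33-bit timestamp and produce a wider, non-wrapping value."""

    def __init__(self) -> None:
        self.wraps = 0
        self.last: Optional[int] = None

    def extend(self, ts33: int) -> int:
        # Assumption: timestamps arrive in non-decreasing order, so a decrease
        # means the 33-bit counter wrapped.
        if self.last is not None and ts33 < self.last:
            self.wraps += 1
        self.last = ts33
        return self.wraps * WRAP + ts33


def to_transport(extended: int) -> int:
    """Reduce a higher-precision timestamp to the 33 bits sent to the client."""
    return extended & (WRAP - 1)
```

Seeks would compare the extended values, while `to_transport` recovers the lower-precision value that is placed in the stream sent to a client.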

The video encoding and delivery techniques described herein empower users with
control of functions that were previously in the exclusive domain of program providers.
For example, program providers currently determine which plays of a Super Bowl will be
replayed to viewers, the speed at which they will be replayed, and how many times they
will be replayed.

However, viewers may have strongly differing opinions as to which plays merit
multiple viewings. For example, two viewers may dispute the accuracy of a particular call.
However, the program provider may not consider the play that gave rise to the call to be
significant enough to replay the play. Using the techniques provided herein, viewers may
determine for themselves which plays should be immediately replayed, at what speed they
are replayed, and how many times they are replayed.

In the foregoing specification, the invention has been described with reference to
specific embodiments thereof. It will, however, be evident that various modifications and
changes may be made thereto without departing from the broader spirit and scope of the
invention. The specification and drawings are, accordingly, to be regarded in an illustrative
rather than a restrictive sense.

CLAIMS

What is claimed is:

1. A digital video delivery system comprising:
an encoder configured to receive visual information;
said encoder being configured to generate content data that represents the visual
information in a digital video format; and
said encoder being configured to generate control data in parallel with said content
data, said control data indicating locations of frames contained in said
content data.

2. The system of Claim 1 further comprising:
a video pump coupled between the encoder and a communication channel;
said video pump being configured to transmit said content data to a client over said
communication channel, and to provide said client non-sequential access to
the visual information based upon said control data.

3. The system of Claim 2 further comprising a video server coupled between said
encoder and said video pump, said video server causing said control data to be
made available to said video pump only after delaying said control data relative to
the corresponding content data.

4. The system of Claim 2 further comprising a storage system coupled between said
encoder and said video pump, said storage system including a server that transmits
said content data to said video pump when requested by said video pump, and
transmits end-of-file information for said content data to said video pump without
said video pump requesting said end-of-file data.

5. The system of Claim 2 wherein said encoder includes:
a CODEC that generates digital information in response to said visual information;
a multiplexer coupled to said CODEC;
said multiplexer arranging said digital information generated by said CODEC
according to said digital video format; and
said multiplexer generating said control data to indicate how said multiplexer
arranged said digital information.



6. The system of Claim 2 wherein said CODEC is a real-time CODEC and said visual
information is from a live feed.

7. The system of Claim 2 further comprising:
a video server operatively coupled to receive the content data and control data from said
encoder and to transmit said content data and control data;
an MDS system coupled to the video server;
said MDS including one or more storage devices;
said MDS system being configured to receive said content data from said video server
and store said content data on said one or more storage devices, and to
receive said control data from said video server and store said control data
on said one or more storage devices;
said video pump being configured to read said content data from said one or more
storage devices of said MDS system.

8. A method for providing non-sequential access to visual information that is being
digitally encoded in a digital data stream, wherein said digital data stream includes a
sequence of video frame data, each video frame data in said sequence of video frame data
corresponding to a video frame of said visual information, the method comprising the
computer-implemented steps of:
generating said digital data stream with an encoder;
causing said encoder to generate tag data that indicates locations of said video
frame data within said digital data stream;
storing said digital data stream at a location from which the digital data stream is
delivered to a client; and
storing said tag data at a location from which the tag data may be used to provide
the client non-sequential access to the digital data stream.

9. The method of Claim 8 further comprising the steps of:
selecting a selected set of video frames within said digital data stream based on said
tag data in response to a request for non-sequential access by said client;
constructing a second digital data stream that includes the video frame data that
corresponds to each video frame of said selected set of video frames; and
transmitting said second digital data stream to said client.

10. The method of Claim 8 further comprising the step of making said tag data
available to a pump that sends said second digital data stream only after delaying
said tag data relative to the corresponding frame data in said digital data stream.

11. The method of Claim 8 further comprising the steps of transmitting said digital data
stream to a video pump when requested by said video pump, and transmitting
end-of-file information for said digital data stream to said video pump without said
video pump requesting said end-of-file data.

12. The method of Claim 8 wherein said encoder includes:
a real-time CODEC that generates digital information in response to said visual
information;
a multiplexer coupled to said real-time CODEC;
said multiplexer arranging said digital information generated by said real-time
CODEC according to a digital video format; and
said multiplexer generating said tag data to indicate how said multiplexer arranged
said digital information.

13. The method of Claim 8 wherein said step of generating said digital data stream with
an encoder includes encoding visual information from a live feed.

14. A computer-readable medium having stored thereon sequences of instructions for
providing non-sequential access to visual information that is being digitally
encoded in a digital data stream, wherein said digital data stream includes a
sequence of video frame data, each video frame data in said sequence of video
frame data corresponding to a video frame of said visual information, the sequences
of instructions including instructions for performing the steps of:
while said digital data stream is being generated by an encoder, causing said
encoder to generate tag data that indicates locations of said video frame data
within said digital data stream;
storing said digital data stream at a location from which the digital data stream is
delivered to a client; and
storing said tag data at a location from which the tag data may be used to provide
the client non-sequential access to the digital data stream.

15. The computer-readable medium of Claim 14 further comprising instructions for
performing the steps of:
selecting a selected set of video frames within said digital data stream based on said
tag data in response to a request for non-sequential access by a client;
constructing a second digital data stream that includes the video frame data that
corresponds to each video frame of said selected set of video frames; and
transmitting said second digital data stream to said client.

16. The computer-readable medium of Claim 14 further comprising sequences of
instructions for performing the step of making said tag data available to a pump that
sends said second digital data stream only after delaying said tag data relative to the
corresponding frame data in said digital data stream.

17. The computer-readable medium of Claim 14 further comprising sequences of
instructions for performing the steps of transmitting said digital data stream to a
video pump when requested by said video pump, and transmitting end-of-file
information for said digital data stream to said video pump without said video
pump requesting said end-of-file data.

18. The computer-readable medium of Claim 14 wherein said encoder includes:
a real-time CODEC that generates digital information in response to said visual
information;
a multiplexer coupled to said real-time CODEC;
said sequences of instructions including instructions which cause said multiplexer
to arrange said digital information generated by said real-time CODEC
according to a digital video format; and
said sequences of instructions including instructions which cause said multiplexer
to generate said tag data to indicate how said multiplexer arranged said
digital information.



19. The computer-readable medium of Claim 14 wherein said step of generating said
digital data stream with an encoder includes encoding visual information from a
live feed.



WO 99/21364    PCT/US98/22018

[Sheets 1/8 and 2/8: drawing content not recoverable from the extracted text]



Fig. 2B

TAG FILE 132
    FILE TYPE IDENTIFIER 202
    LENGTH INDICATOR 204
    BIT RATE INDICATOR 206
    PLAY DURATION INDICATOR 208
    FRAME NUMBER INDICATOR 210
    STREAM ACCESS INFORMATION 212
    INITIAL MPEG TIME OFFSET 213
    TAG FROM FRAME 1
    TAG FROM FRAME 2
    ...
    TAG FROM FRAME T (214)
    ...
    TAG FROM FRAME N

214 FOR MPEG-2
    PES OFFSET AT THE START OF PICTURE 217
    PES OFFSET AT THE END OF PICTURE 219
    VIDEO STATE
        PICTURE SIZE 220
        START POSITION 226
        TIME VALUE 228
        FRAME TYPE 232
        TIMING BUFFER INFO 238
    TRANSPORT LAYER STATE
        START OFFSET 234
        END OFFSET 236
        DISCONTINUITY FLAG 230
        CURRENT CONTINUITY COUNTER 215
        # NON-VIDEO PACKETS 222
        # PADDING PACKETS 224



Fig. 2C

TAG FILE 132
    FILE TYPE IDENTIFIER 202
    LENGTH INDICATOR 204
    BIT RATE INDICATOR 206
    PLAY DURATION INDICATOR 208
    FRAME NUMBER INDICATOR 210
    STREAM ACCESS INFORMATION 212
    INITIAL MPEG TIME OFFSET 213
    TAG FROM FRAME 1
    TAG FROM FRAME 2
    ...
    TAG FROM FRAME T (214)
    ...
    TAG FROM FRAME N

214 FOR MPEG-1
    AMOUNT OF NONVIDEO DATA 221
    AMOUNT OF PADDING DATA 223
    PACK OFFSET AT START 225
    PACK REMAINING AT START 227
    PACK OFFSET AT END 229
    PACK REMAINING AT END 231
    VIDEO STATE
        PICTURE SIZE 233
        PICTURE START POS 235
        PICTURE END POS 237
        FRAME TYPE 239
        TIME VALUE 241
        TIMING BUFFER INFO 243



Fig. 3A (300)

Ref.   DISK_1 302        DISK_2 304        ...   DISK_N 306        DISK_N+1 308
350    BLOCK (1,1) 310   BLOCK (1,2) 312   ...   BLOCK (1,N) 314   CHECK BITS 334
352    BLOCK (2,1) 316   BLOCK (2,2) 318   ...   BLOCK (2,N) 320   CHECK BITS 336
354    BLOCK (3,1) 322   BLOCK (3,2) 324   ...   BLOCK (3,N) 326   CHECK BITS 338
356    BLOCK (4,1) 328   BLOCK (4,2) 330   ...   BLOCK (4,N) 332   CHECK BITS 340



Fig. 3B

Ref.   Blocks (first row) and the disks that hold them (second row)
350    BLOCK (1,1) 310   BLOCK (1,2) 312   ...   BLOCK (1,N) 314   CHECK BITS 334
       DISK_1,1 302-1    DISK_1,2 304-1    ...   DISK_1,N 306-1    DISK_1,N+1 308-1
352    BLOCK (2,1) 316   BLOCK (2,2) 318   ...   BLOCK (2,N) 320   CHECK BITS 336
       DISK_2,1 302-2    DISK_2,2 304-2    ...   DISK_2,N 306-2    DISK_2,N+1 308-2
354    BLOCK (3,1) 322   BLOCK (3,2) 324   ...   BLOCK (3,N) 326   CHECK BITS 338
       DISK_3,1 302-3    DISK_3,2 304-3    ...   DISK_3,N 306-3    DISK_3,N+1 308-3
...
356    BLOCK (4,1) 328   BLOCK (4,2) 330   ...   BLOCK (4,N) 332   CHECK BITS 340
       DISK_M,1 302-M    DISK_M,2 304-M    ...   DISK_M,N 306-M    DISK_M,N+1 308-M



[Sheets 7/8 and 8/8: drawing content not recoverable from the extracted text]



INTERNATIONAL SEARCH REPORT

International Application No: PCT/US 98/22018

A. CLASSIFICATION OF SUBJECT MATTER
IPC 6 H04N7/173

According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED
Minimum documentation searched (classification system followed by classification symbols):
IPC 6 H04N

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched:

Electronic data base consulted during the international search (name of data base and, where practical, search terms used):

C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category: X, A
Citation: US 5 659 539 A (PORTER MARK A ET AL), 19 August 1997
    cited in the application
    see abstract
    see column 7, line 25 - column 7, line 53
    see column 19, line 21 - column 19, line 42
    see column 22, line 60 - column 23, line 2
    see figures 1B, 7
Relevant to claim No.: 1,2,4-9,11-15,17-19; 3,10,16

Citation: EP 0 748 122 A (IBM), 11 December 1996
    see abstract
    see page 3, line 17 - page 3, line 29
    see page 6, line 28 - page 8, line 2
    see page 8, line 48 - page 8, line 53
    see figures 3-6
Relevant to claim No.: 1,2,5-9,12-15,18,19

Further documents are listed in the continuation of box C.
Patent family members are listed in annex.

Special categories of cited documents:
"A" document defining the general state of the art which is not considered to be of particular relevance
"E" earlier document but published on or after the international filing date
"L" document which may throw doubts on priority claim(s) or which is cited to establish the publication date of another citation or other special reason (as specified)
"O" document referring to an oral disclosure, use, exhibition or other means
"P" document published prior to the international filing date but later than the priority date claimed
"T" later document published after the international filing date or priority date and not in conflict with the application but cited to understand the principle or theory underlying the invention
"X" document of particular relevance; the claimed invention cannot be considered novel or cannot be considered to involve an inventive step when the document is taken alone
"Y" document of particular relevance; the claimed invention cannot be considered to involve an inventive step when the document is combined with one or more other such documents, such combination being obvious to a person skilled in the art
"&" document member of the same patent family

Date of the actual completion of the international search: 28 January 1999
Date of mailing of the international search report: 04/02/1999

Name and mailing address of the ISA:
European Patent Office, P.B. 5818 Patentlaan 2
NL-2280 HV Rijswijk
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl
Fax: (+31-70) 340-3016

Authorized officer: Hampson, F

Form PCT/ISA/210 (second sheet) (July 1992)
page 1 of 2



INTERNATIONAL SEARCH REPORT

International Application No: PCT/US 98/22018

C. (Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT

Citation: WO 94 07332 A (SONY CORP), 31 March 1994
    see abstract
    see page 28, line 21 - page 33, line 23
    see page 38, line 10 - page 40, line 4
    see figures 11,12
Relevant to claim No.: 1,8,9,14,15

Form PCT/ISA/210 (continuation of second sheet) (July 1992)
page 2 of 2



INTERNATIONAL SEARCH REPORT
Information on patent family members

International Application No: PCT/US 98/22018

Patent document cited     Publication    Patent family      Publication
in search report          date           member(s)          date

US 5659539                19-08-1997     CA 2197323 A       06-02-1997
                                         EP 0781490 A       02-07-1997
                                         WO 9704596 A       06-02-1997

EP 0748122 A              11-12-1996     US 5721878 A       24-02-1998
                                         JP 9009239 A       10-01-1997

WO 9407332 A              31-03-1994     JP 2785220 B
                                         JP 6325553 A
                                         JP 6164522 A
                                         AT 170354 T
                                         AU 669563 B
                                         AU 4833393 A
                                         DE 69320620
                                         DE 69320620
                                         EP 0622002
                                         EP 0794667
                                         JP 6267196
                                         US 5455684
                                         US 5504585
                                         US 5568274

Form PCT/ISA/210 (patent family annex) (July 1992)



